Using deep belief networks for vector-based speaker recognition
نویسنده
چکیده
Deep belief networks (DBNs) have become a successful approach for acoustic modeling in speech recognition. DBNs exhibit strong approximation properties, improved performance, and are parameter efficient. In this work, we propose methods for applying DBNs to speaker recognition. In contrast to prior work, our approach to DBNs for speaker recognition starts at the acoustic modeling layer. We use sparse-output DBNs trained with both unsupervised and supervised methods to generate statistics for use in standard vector-based speaker recognition methods. We show that a DBN can replace a GMM UBM in this processing. Methods, qualitative analysis, and results are given on a NIST SRE 2012 task. Overall, our results show that DBNs show competitive performance to modern approaches in an initial implementation of our framework.
منابع مشابه
i-Vector Modeling with Deep Belief Networks for Multi-Session Speaker Recognition
In this paper we propose an impostor selection method for a Deep Belief Network (DBN) based system which models i-vectors in a multi-session speaker verification task. In the proposed method, instead of choosing a fixed number of most informative impostors, a threshold is defined according to the frequencies of impostors. The selected impostors are then clustered and the centroids are considere...
متن کاملشبکه عصبی پیچشی با پنجرههای قابل تطبیق برای بازشناسی گفتار
Although, speech recognition systems are widely used and their accuracies are continuously increased, there is a considerable performance gap between their accuracies and human recognition ability. This is partially due to high speaker variations in speech signal. Deep neural networks are among the best tools for acoustic modeling. Recently, using hybrid deep neural network and hidden Markov mo...
متن کاملA hybrid EEG-based emotion recognition approach using Wavelet Convolutional Neural Networks (WCNN) and support vector machine
Nowadays, deep learning and convolutional neural networks (CNNs) have become widespread tools in many biomedical engineering studies. CNN is an end-to-end tool which makes processing procedure integrated, but in some situations, this processing tool requires to be fused with machine learning methods to be more accurate. In this paper, a hybrid approach based on deep features extracted from Wave...
متن کاملCombining pattern recognition and deep-learning-based algorithms to automatically detect commercial quadcopters using audio signals (Research Article)
Commercial quadcopters with many private, commercial, and public sector applications are a rapidly advancing technology. Currently, there is no guarantee to facilitate the safe operation of these devices in the community. Three different automatic commercial quadcopters identification methods are presented in this paper. Among these three techniques, two are based on deep neural networks in whi...
متن کاملModeling speaker variability using long short-term memory networks for speech recognition
Speaker adaptation of deep neural networks (DNNs) based acoustic models is still a challenging area of research. Considering that long short-term memory (LSTM) recurrent neural networks (RNNs) have been successfully applied to many sequence prediction and sequence labeling tasks, we propose to use LSTM RNNs for modeling speaker variability in automatic speech recognition (ASR). Firstly, the LST...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2014